Provisioned Concurrency in AWS Lambda
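Provisioned Concurrency is an AWS Lambda feature that keeps a configured number of execution environments initialized and ready to respond immediately. With standard on-demand invocations, a request that arrives when no environment is available triggers a cold start while Lambda creates one and runs the function's initialization code. Provisioned concurrency avoids that delay for requests served by the pre-initialized environments; requests that exceed the provisioned number fall back to on-demand capacity and can still cold start. This makes the feature especially useful for latency-sensitive applications where response time is critical.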
When to use it
For APIs or web applications that require consistently low latency
Ahead of anticipated traffic spikes or predictable high-traffic periods
For scheduled tasks that must start immediately, without cold start delays
In high-performance backend processing where startup time affects user experience
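How it works
You define the number of pre-initialized execution environments for a specific published version or alias of your Lambda function (it cannot be set on $LATEST)
Lambda keeps that number of environments initialized, running the function's initialization code before any request arrives
Requests to that version or alias are routed to the pre-initialized environments first, avoiding cold starts; requests beyond the provisioned number are served by on-demand environments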
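
The steps above can be sketched in Python with boto3. This is a minimal illustration rather than a definitive setup: the helper names configure_provisioned_concurrency and is_ready, the function name "my-function", and the alias "live" are all hypothetical. The helpers take the Lambda client as an argument so they can be tried against a stub before touching a real account.

```python
def configure_provisioned_concurrency(client, function_name, qualifier, environments):
    """Ask Lambda to keep `environments` execution environments initialized.

    `client` is expected to be a boto3 Lambda client (or a stub with the same
    method). `qualifier` must be a published version number or an alias name.
    """
    if qualifier == "$LATEST":
        # Provisioned concurrency applies to a published version or alias only.
        raise ValueError("qualifier must be a published version or alias, not $LATEST")
    if environments < 1:
        raise ValueError("environments must be at least 1")
    return client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=qualifier,
        ProvisionedConcurrentExecutions=environments,
    )


def is_ready(client, function_name, qualifier):
    """Return True once the requested environments have finished initializing."""
    response = client.get_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=qualifier,
    )
    return response["Status"] == "READY"


# Example usage (assumes AWS credentials and a function "my-function"
# with an alias "live"; both names are placeholders):
#
#   import boto3
#   lambda_client = boto3.client("lambda")
#   configure_provisioned_concurrency(lambda_client, "my-function", "live", 5)
#   print(is_ready(lambda_client, "my-function", "live"))
```

Allocation is not instantaneous, so a deployment script would typically poll is_ready until the status changes before shifting latency-sensitive traffic to the alias.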